-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(spanner): fix negative values for max_in_use_sessions metrics #10449
Conversation
spanner/session_test.go
Outdated
if (sp.idleList.Len() + | ||
int(sp.createReqs)) != 1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can this be on one line (I found it hard to read in the current form)
spanner/session_test.go
Outdated
sh.recycle() | ||
} | ||
|
||
for true { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can we add an escape hatch to the loop (e.g. stop after X time)? Now a future bug could cause this to loop for ever, which is harder to debug than a test failure after for example 2 seconds.
spanner/session.go
Outdated
// Decrease the number of sessions in use. | ||
p.decNumInUseLocked(ctx) | ||
// Decrease the number of sessions in use, only when not from idle list. | ||
if !isExpire { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that it would be better to add an additional argument to the remove(..)
method that explicitly says whether the session was in use or not. So something like this:
func (p *sessionPool) remove(s *session, isExpire bool, wasInUse bool) {
...
if wasInUse {
p.decNumInUseLocked(ctx)
}
}
In theory, it could be that this method is called in the future to remove sessions that have not expired, but that also were not in use at the time that they were being removed, and then we could re-introduce a similar bug as the one here. That is less likely with an explicit argument that clearly says what it is for.
208c629
to
53f7f2c
Compare
53f7f2c
to
78e81d5
Compare
spanner/session.go
Outdated
@@ -907,7 +907,7 @@ func (p *sessionPool) close(ctx context.Context) { | |||
|
|||
func deleteSession(ctx context.Context, s *session, wg *sync.WaitGroup) { | |||
defer wg.Done() | |||
s.destroyWithContext(ctx, false) | |||
s.destroyWithContext(ctx, false, true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is deleteSession
only called for sessions that are in use at that moment?
Note that inUse
means that the session was checked out of the pool at the moment that this method is being called. So in this case it would mean that we are calling deleteSession(..)
for a session that was checked out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed the function to closeSession as it will only be called when doing application cleanup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But won't that then mean that the metric will drop to a negative value for a (very) short time when the application is shutting down? Assume that the situation is that:
- The pool has 100 sessions.
- 10 of them are in use.
- The application shuts down and closes the client.
- The client closes all sessions and calls this method for all 100 sesssions.
- The inUseSessions metric drop from 10 to -90 for a very short time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wasInUse was set to false in latest commit, hence negative values should not appear, and graphs will just show last exported value.
@@ -197,7 +197,7 @@ func (sh *sessionHandle) destroy() { | |||
p.trackedSessionHandles.Remove(tracked) | |||
p.mu.Unlock() | |||
} | |||
s.destroy(false) | |||
s.destroy(false, true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is destroy()
only called for sessions that are in use? I would expect that it could also be called for a session that is in the list of idle sessions, and in that case it was not in use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sessionHandle is the wrapper which is created only for transactions(to be used) so we can assume any call for destroy using sessionHandle was in use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@olavloite added comment for the same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that makes it clearer, thanks!
(#10508) * fix(spanner): add debug log to print full stack trace when negative value happens * skip decrementing num_in_use metric count when session is destroyed from healthchecks.
Internal bug: b/343756862